REVIEW

Genome-wide mechanisms of Smad binding 

M Morikawa 1,2 , D Koinuma 2 , K Miyazono 1,2 and C-H Heldin 1 

A dual role of transforming growth factor-β (TGF-β), to both suppress
and promote tumor progression and metastasis, has been well
established, but its molecular basis has remained elusive. In this
review, we focus on Smad proteins, which are central mediators of the
signal transduction of TGF-β family members. We describe current
knowledge of cell-type-specific binding patterns of Smad proteins and
mechanisms of transcriptional regulation, obtained from recent studies
on genome-wide binding sites of Smad molecules. We also discuss
potential applications of genome-wide analyses in cancer research,
which will allow clarification of the complex mechanisms occurring
during cancer progression, and the identification of potential
biomarkers for future cancer diagnosis, prognosis and therapy.

INTRODUCTION

Members of the transforming growth factor-β (TGF-β) family,
which include three TGF-β isoforms, as well as activins, nodal and
bone morphogenetic proteins (BMPs), regulate a variety of cellular
processes including differentiation, proliferation, migration and
cell death in cell-type-specific and context-dependent manners. 1-3
The biological effects of TGF-β family members are highly
contextual; for example, responses may differ between tissues,
local environments and stages of disease. Since TGF-β
activates cytostatic and cell death processes that maintain
homeostasis in mature tissues, it functions as a suppressor of
epithelial cell tumorigenesis at early stages. Inactivation of the
TGF-β signaling pathway through mutation and/or loss of
heterozygosity of TGF-β receptors or Smad proteins has been
found in certain types of cancer and is related to poor prognosis
for the patients (reviewed in Levy and Hill 4). However, TGF-β
promotes tumor progression by enhancing migration, invasion
and survival of tumor cells during the later stages of
tumorigenesis, through stimulating extracellular matrix
deposition and tissue fibrosis, perturbing immune and
inflammatory function, stimulating angiogenesis and promoting
epithelial-mesenchymal transition (reviewed in Yoshimura et al. 5,
Roberts and Wakefield 6, Moustakas and Heldin 7 and Miyazono
et al. 8). Accumulating evidence also indicates critical roles of
TGF-β/activin signaling in the maintenance of stem cell-like
properties of certain cancer-initiating cells, such as
glioma-initiating cells, 9,10 breast cancer-initiating cells, 11
pancreatic cancer-initiating cells, 12 and leukemia-initiating cells
in chronic myeloid leukemia. 13 Intriguingly, small-molecule
inhibitors of type I receptors have therapeutic effects, at least in
animal models. 9,10,12,13 These observations suggest that targeting
the TGF-β/activin signaling pathways could be an attractive therapy
in certain advanced cancers, although it is possible that shutdown
of these pathways in normal tissues will increase the risk for the
development of other tumors. Thus, one of the major questions that
remain to be addressed in this field is what defines the dual role
of TGF-β in cancer biology.



Identification of the signaling components of TGF-β family
members, including membrane receptor serine/threonine kinases
and Smad transcription factors, has led to an understanding of the
molecular mechanisms underlying this highly contextual
process. 14,15 Genome-wide transcriptome analyses in various cell
types have identified many target genes that are required for
ligand-mediated cellular responses. Direct binding of Smad
complexes was confirmed by in vitro binding assays, promoter
assays and chromatin immunoprecipitation (ChIP) followed by
polymerase chain reaction. Until recently, however, regulatory
elements were mainly identified in the promoter regions of the
target genes, especially 1-2 kb upstream of their transcription
start sites.

ChIP with promoter array analysis (ChIP-chip) and ChIP followed
by sequencing (ChIP-seq) have become powerful tools for
genome-wide mapping of protein-binding sites and epigenetic
marks. 16,17 In these methods, the DNA sample obtained after the
ChIP procedure is analyzed using promoter-tiling arrays or massively
parallel sequencing (Supplementary Figure 1), which provides a
comprehensive chromatin-binding landscape of target transcription
factors. Information obtained by these analyses has shed light
on previously unrecognized mechanisms and has sometimes
challenged notions that were established in a specific situation.
Recently, several groups have reported that Smad proteins tend to
co-occupy target sites with cell-type-specific master transcription
factors. 18-20 The results also indicate that co-occupied regions
mainly overlap with enhancer elements, although previous studies
have identified numerous Smad-responsive elements in the
promoter regions of their target genes. In addition, recent
ChIP-chip/ChIP-seq studies have identified a group of direct target
genes, or target gene signatures, in specific cell types and cellular
contexts. Intriguingly, Kennedy et al. 21 reported that the
TGF-β/Smad4 target gene signature identified in ovarian cancer cell
lines predicts patient survival.

In this review, we discuss current knowledge of cell-type-specific
binding patterns of Smad proteins and mechanisms of
transcriptional regulation obtained from recent ChIP-chip/ChIP-seq
studies (Supplementary Table 1). We also highlight
applications of the genome-wide analyses for cancer research.
These insights contribute to the unraveling of the complex
mechanisms of TGF-β signaling in cancer biology.



OVERVIEW OF SIGNALING PATHWAYS OF TGF-β FAMILY MEMBERS

The TGF-β family consists of 33 members in mammals. Two
types of serine/threonine kinase transmembrane receptors, that
is, type II and type I receptors, are required for intracellular
signal transduction by the TGF-β family members. 14 Five type II
receptors and seven type I receptors are present in mammals. 22
Ligand binding assembles specific type II and type I receptors into
heterotetramers. The type II receptor then transphosphorylates
and activates the type I receptor, which subsequently transduces
the signal by phosphorylating the carboxyl terminus of
receptor-regulated (R)-Smads. In most cell types, TGF-β and activin
induce phosphorylation of Smad2 and Smad3 (activin/TGF-β-specific
R-Smads, or AR-Smads) and BMPs induce phosphorylation of
Smad1, Smad5 and Smad8 (BMP-specific R-Smads, or BR-Smads).
Activated R-Smads form heterooligomeric complexes with
common-partner (co)-Smad (Smad4). The complexes translocate
into the nucleus where they regulate the expression of target
genes, such as the genes for Serpine1 (plasminogen activator
inhibitor-1), inhibitory (I)-Smads (Smad6 and Smad7) and Id1
(inhibitor of differentiation-1 or inhibitor of DNA binding-1)
(Figure 1). Because of their relatively low DNA-binding affinity,
Smad complexes interact with a wide variety of DNA-binding
proteins and cooperatively regulate a synexpression group of
target genes (Figure 2a). 2 So far, several transcription factors,
such as AP-1, 23 ETS, 24,25 basic helix-loop-helix proteins, 26,27
C/EBPβ, 28 FoxH1 and FoxO have been identified and validated
as important cofactors of TGF-β/BMP signaling pathways.
In addition, Smad complexes recruit coactivators, such as p300
and CREB-binding protein, 32,33 or corepressors, such as ATF-3. 34
For example, TGF-β represses transcription of the Id1 gene in
epithelial cells through formation of a complex with ATF-3, while
TGF-β induces Id1 in cells that do not express ATF-3, such as
glioma-initiating cell-like cells. 35 Since ATF-3 is induced by tumor
necrosis factor-α, signaling crosstalk between the TGF-β and tumor
necrosis factor-α pathways determines the transcriptional
regulation of Id1. Thus, crosstalk with other signaling pathways
and interaction with other DNA-binding cofactors define the
specific binding patterns of Smads; in addition, interaction with
coactivators/corepressors modulates their transcriptional activity
(Figure 1).

Smad proteins are targets of protein modifications, such as
phosphorylation, ubiquitination and ADP-ribosylation. The
cyclin-dependent kinases (CDKs) CDK8 and CDK9, which are downstream
effectors of extracellular-signal-regulated kinase (ERK) MAP kinase,
phosphorylate the linker region of Smads in the nucleus. 36-39
Glycogen synthase kinase-3β (GSK3β) also phosphorylates the
linker region of Smads, which requires priming phosphorylation
by ERK MAP kinase. 40 These phosphorylations mark the proteins
for polyubiquitination and promote proteasome-mediated
degradation of Smad complexes. Several WW domain proteins
have been reported to recognize the phosphorylated linker
regions and interact with R-Smads. 41 Smurf1 is a member of the
E3 ubiquitin ligase family, which can target BR-Smads for
degradation, 42 while NEDD4L (also known as NEDD4-2) is an E3
ubiquitin ligase for AR-Smads. 43,44 Consequently, endogenous ERK
MAPK and GSK3β signaling pathways are able to antagonize Smad
activity through proteasome-mediated degradation. Recently,
deubiquitinating enzymes (DUBs) for Smad proteins have been
identified. 45,46 Monoubiquitination of the lysine-519 (K519)
residue of Smad4 prevents its association with R-Smads and
negatively regulates the TGF-β/BMP signaling pathway. USP9x (also
known as FAM) has been identified as a DUB that reverts this
modification. 45 R-Smads are monoubiquitinated in their
DNA-binding domains, which attenuates their affinity for DNA. This
monoubiquitination is opposed by another DUB, USP15. 46
Recently, Lönn et al. 47 found that Smad proteins are targets of
ADP-ribosylation. Poly(ADP-ribose) polymerase-1 (PARP-1)
interacts with and ADP-ribosylates Smad3 and Smad4 in the
nucleus, and affects the binding affinity of Smad complexes in a
context-dependent manner. 7,48 Thus, posttranslational
modifications of Smad proteins affect their signal transduction
capacities; some of these modifications are regulated by other
signaling pathways (Figure 1).

Figure 1. Signaling of TGF-β family members through Smad
complexes. Smad proteins are central mediators of the signal
transduction of TGF-β family members. Ligand binding assembles
specific type II and type I receptors into heterotetramers. The type II
receptor transphosphorylates (P) and activates the type I receptor,
which subsequently activates receptor-regulated (R)-Smads.
Activated R-Smads form heterooligomeric complexes with
common-partner (co)-Smad. In the nucleus, Smad complexes interact
with DNA-binding cofactors and cooperatively regulate a group of
target genes. Crosstalk with other growth regulatory factors affects
the specific binding patterns and transcriptional activity of Smads.



SMAD-BINDING MOTIFS 

The R-Smads and Smad4 are composed of two evolutionarily
conserved domains named Mad Homology 1 and 2 (MH1 and
MH2). The MH2 domain plays an important role in the formation
of heterooligomeric Smad complexes and in transcriptional
activation, whereas the MH1 domain is responsible for
sequence-specific DNA-binding activity. Using a polymerase chain
reaction-based random-oligonucleotide selection process, an 8-bp
palindromic DNA sequence, GTCTAGAC, was identified as a Smad3
and Smad4 binding motif. 49 In contrast to Smad3 and Smad4,
Smad2 does not directly bind to DNA due to steric hindrance by
an inserted sequence in the DNA-binding region. 50 The crystal
structures of the MH1 domains of Smad1 and Smad3 have revealed
that R-Smads recognize and directly bind to half of the
palindrome, that is, GTCT or AGAC sequences, through an
11-amino-acid β-hairpin loop in the MH1 domain. 51-53
The amino-acid sequences of the loop are completely conserved
among R-Smads and show a high level of similarity between
R-Smads and Smad4. The half-site sequences are usually referred
to as the CAGA box or Smad binding element (SBE). Recent
ChIP-chip/ChIP-seq studies have confirmed that the SBE is enriched
in the Smad2/3-binding regions. 18,24,26,54,55
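
As a rough illustration of how the motifs described above can be
located in a sequence, the following Python sketch scans one strand for
the SBE half-sites (GTCT and AGAC) and for the 8-bp palindrome GTCTAGAC.
The function names and the toy sequence are hypothetical and are not
taken from the studies cited here; exact string matching is a
simplification, since motif analyses usually rely on position weight
matrices.

```python
import re

SBE_HALF_SITES = ("GTCT", "AGAC")  # half-sites named in the text
SBE_PALINDROME = "GTCTAGAC"        # 8-bp palindromic motif named in the text


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) hit."""
    sequence = sequence.upper()
    # A zero-width lookahead reports overlapping matches as well.
    pattern = f"(?={re.escape(motif)})"
    return [match.start() for match in re.finditer(pattern, sequence)]


def scan_sbe(sequence: str) -> dict[str, list[int]]:
    """Map each SBE-related motif to the positions where it occurs."""
    motifs = SBE_HALF_SITES + (SBE_PALINDROME,)
    return {motif: find_motif(sequence, motif) for motif in motifs}


# Hypothetical toy sequence, used only to exercise the functions.
toy_sequence = "ttGTCTAGACaaAGACcc"
for motif, hits in scan_sbe(toy_sequence).items():
    print(motif, hits)
```

Because GTCT is the reverse complement of AGAC, searching a single
strand for both half-sites covers both strands of the DNA.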

Although the MH1 domain of Smad1 has high affinity for
SBE, 52,53 BR-Smads seem to prefer a GC-rich sequence, such as
GCCGnCGC, which was originally identified in Drosophila. 56
In mammals, GC-rich sequences, such as GCCG and (T)GGCGCC,
have been identified in the promoter regions of several BMP
target genes. Using a de novo motif-finding method, we identified
a Smad1/5-binding motif, which is consistent with the previously
reported GC-rich sequences and was thus named the GC-rich SBE
(GC-SBE). 57 Importantly, both GC-SBE and SBE are enriched in the
Smad1/5-binding sites identified in both endothelial cells (ECs)
and pulmonary arterial smooth muscle cells (PASMCs). 57 Since
binding motifs for R-Smads have been identified in vitro and
in vivo, candidate Smad-binding sites can be predicted in the
promoter regions of the target genes. However, these motifs are
common throughout the genome, and the majority of them are
not occupied by R-Smads when examined using ChIP-chip/ChIP-seq.
Thus, additional mechanisms operate to determine the
binding patterns of Smads.
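
A back-of-the-envelope estimate shows why such short motifs cannot by
themselves account for binding specificity. The Python sketch below
assumes a uniform random sequence in which each base occurs with
probability 1/4; real genomes deviate from this, so the numbers are only
indicative. The function names are hypothetical, and "N" is treated as
"any base" to accommodate motifs such as GCCGnCGC.

```python
def motif_probability(motif: str) -> float:
    """Probability that a uniform random sequence matches the motif at one position."""
    probability = 1.0
    for base in motif.upper():
        # "N" matches any base; every other symbol matches one base in four.
        probability *= 1.0 if base == "N" else 0.25
    return probability


def expected_hits(motifs: list[str], sequence_length: int) -> float:
    """Expected number of hits on one strand, summed over distinct motifs."""
    total = 0.0
    for motif in motifs:
        positions = sequence_length - len(motif) + 1
        if positions > 0:
            total += positions * motif_probability(motif)
    return total


# Either SBE half-site is expected about once every 128 bp under this model.
print(1 / (motif_probability("GTCT") + motif_probability("AGAC")))
# Expected SBE half-site hits in a hypothetical 1-Mb region.
print(expected_hits(["GTCT", "AGAC"], 1_000_000))
```

Under this simplified model a half-site appears on the order of
thousands of times per megabase, which is consistent with the
observation that most motif instances are not occupied.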

FACTORS THAT DETERMINE THE BINDING PATTERNS OF SMADS

Recent studies have suggested that Smad complexes colocalize
with master transcription factors that specify and maintain cell
identities. 18-20 Chen et al. 20 pointed out that Smad1 colocalizes
in the multiple transcription factor-binding loci with embryonic
stem (ES) cell-specific transcription factors, such as Oct4 and Sox2,
in mouse ES cells (mESCs). Mullen et al. 18 reported that binding
regions of Smad3 also overlap with those of Oct4 in both human
and mouse ES cells. Intriguingly, at least some of these
co-occupied regions are still enriched after tandem ChIP-re-ChIP
experiments, indicating that Oct4 and Smad3 bind to similar
regions in mESCs simultaneously. 18 Moreover, Smad3 colocalizes
with MyoD (encoded by Myod1) or PU.1, master transcription
factors controlling muscle or hematopoietic differentiation,
respectively, in specific cell types that express these genes;
forced expression of MyoD in mESCs is sufficient to redirect
Smad3 to muscle-specific binding sites, where they colocalize. 18
In addition, Trompouki et al. 19 reported that induction of the
myeloid lineage regulator C/EBPα shifted Smad1 to sites newly
occupied by C/EBPα in the human erythroleukemia cell line K562.
Overexpression of the erythroid regulator GATA1 restricts Smad1
binding to erythroid genes, while binding to genes expressed in
other lineages is diminished. 19 These findings suggest that Smad
complexes are passively recruited to cell-type-specific binding
sites through interaction with master transcription factors.

On the other hand, we recently found that HNF4α, one of the
master regulators of hepatocyte differentiation and liver function,
contributes to the hepatocyte-specific binding pattern of
Smad2/3. 58 Interestingly, 32.5% of the Smad2/3-binding regions
overlapped with those of HNF4α, so most Smad2/3 sites lie outside
HNF4α-bound regions. This argues against the simple model
in which cell-type-specific master regulators recruit R-Smads to
their binding sites and determine their function. In addition,
through the analysis of the distances between the Oct4 peak and
the peaks of Sox2 and Smad3 in mESCs, Mullen et al. 18 found that
Oct4 sites are more closely associated with Sox2 sites than with
Smad3 sites, suggesting that Oct4 and Smad3 do not interact in a
direct manner. They revealed that nucleosomes were relatively
depleted at the sites co-occupied by cell-type-specific master
transcription factors and Smad3, and hypothesized that master
transcription factors increase the accessibility of SBEs and
contribute to Smad3 binding. Intriguingly, MyoD binding has been
reported to be associated with local histone acetylation. 59 PU.1
and C/EBPα binding has been reported to induce nucleosome
remodeling, followed by monomethylation of H3K4. 60 John et al. 61
reported that cell-type-specific glucocorticoid receptor binding
patterns are comprehensively predetermined by cell-specific
differences in baseline chromatin accessibility patterns, with
secondary contributions from local sequence features. Similarly,
comparison of Smad1/5-binding patterns of ECs and PASMCs suggested
that the endothelial-specific binding pattern of Smad1/5 is
predetermined by baseline chromatin accessibility patterns. 57
Thus, these findings support the notion that Smad complexes
determine their target sites together with other DNA-binding
cofactors in two different ways: (1) cell-type- or lineage-specific
transcription factors, or pioneer factors, 62 open up local chromatin
structure to make SBE and GC-SBE accessible and (2) DNA-binding
cofactors, induced and activated in a context-dependent
manner, strengthen the interaction between Smad and DNA
(Figure 2b).
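
Overlap figures such as the 32.5% quoted above depend on how "overlap"
between two sets of binding regions is defined. As a minimal sketch, the
Python code below counts a query region as overlapping when it shares at
least 1 bp with any reference region on the same chromosome; this
criterion, the half-open coordinate convention, the function name and
the toy coordinates are all assumptions made for illustration and are
not taken from the cited study.

```python
from collections import defaultdict

# (chromosome, start, end) with half-open coordinates: start <= x < end
Peak = tuple[str, int, int]


def overlap_fraction(query: list[Peak], reference: list[Peak]) -> float:
    """Fraction of query peaks sharing at least 1 bp with any reference peak."""
    if not query:
        return 0.0
    by_chrom: dict[str, list[tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))
    overlapping = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < ref_end and ref_start < end
               for ref_start, ref_end in by_chrom.get(chrom, [])):
            overlapping += 1
    return overlapping / len(query)


# Hypothetical toy peaks, not real binding data.
smad_peaks = [("chr1", 100, 200), ("chr1", 500, 600),
              ("chr2", 100, 200), ("chr2", 900, 950)]
cofactor_peaks = [("chr1", 150, 250), ("chr2", 950, 1000)]
print(f"{overlap_fraction(smad_peaks, cofactor_peaks):.1%}")  # 25.0%
```

The linear scan per chromosome keeps the sketch short; for genome-scale
peak sets a sorted or interval-tree lookup would be the usual choice.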

Intriguingly, it has been observed that different levels of
activation of Smad signaling pathways cause different binding
patterns of Smad complexes, possibly correlating with the amount of
activated Smad complexes in the nucleus. 63 It has been well
described that different concentrations of activin regulate the
expression of distinct subsets of target genes. 64 Lee et al. 54
confirmed that phospho-Smad2 binds to different subsets of target
genes in a dose-dependent manner and regulates their
transcription in mESCs. Comparing the ChIP-seq data of different
BMP isoforms in ECs, we found that each binding site has a different
binding affinity for Smad complexes and that the strength
of Smad1/5 signaling affects the number and distribution of
Smad-binding sites over the genome. 57 Thus, these findings
suggest that different subsets of target genes are regulated with
distinct dose dependencies, which may cause phenotypic
change.

SMAD BINDING AND HISTONE MODIFICATION MARKERS 

As discussed above, local chromatin structure or accessibility
affects the binding patterns of Smads. Recent studies have
emphasized the importance of enhancers for the precise
regulation of expression of target genes. 18-20,54,57 On the other
hand, several groups have found that most of the Smad-binding
sites are located at promoters of known genes. 30,65,66 Kim et al. 30
reported that 50-60% of Smad2/3 binding occurs in exons and
promoters in human ES cells (hESCs), while only 10-15% of Smad
binding occurs in exons and promoters in derived endoderm. This
finding suggests that the preference of Smads for binding
to either promoters or enhancers is modulated by the
differentiation stage.

Smad proteins have also been shown to induce local chromatin
remodeling and modification at their binding sites. Both Smad1/5
and Smad2/3 have been reported to physically interact with a
histone demethylase, KDM6B (also known as JMJD3), to recruit it
to the NOG (encoding noggin) and NODAL promoter regions,
respectively, and to cause the loss of the repressive mark histone
H3 lysine-27 trimethylation (H3K27me3) in mESCs. 67,68 Recently,
Kim et al. 30 reported that Smad2/3 and KDM6B are simultaneously
enriched in the GSC (encoding goosecoid) and EOMES (encoding
eomesodermin) promoters of hESCs after activin treatment, followed
by the loss of the H3K27me3 repressive mark (Figure 3a).
Interestingly, Fei et al. 65 identified KDM6B as one of
the BMP4-modulated early neural differentiation regulators,
suggesting that loss of repressive histone marks through the
Smad-KDM6B pathway explains the transcriptional regulation,
especially at later time points.

Figure 2. Factors that determine the binding patterns of Smads. (a) A
group of genes that are simultaneously regulated by a specific
Smad-cofactor complex is known as a synexpression group. Distinct
combinations of DNA-binding cofactors in different contexts determine
the set of genes regulated by Smad complexes. (b) Cell-type- or
lineage-specific master transcription factors (purple) open up local
chromatin structure to make Smad-binding regions (red) accessible. The
master transcription factors also physically interact with Smads and,
in some cases, recruit them to their binding sites. DNA-binding
cofactors, induced and activated in a context-dependent manner,
strengthen the interaction between Smad and DNA. Interaction with
coactivators/corepressors also affects the regulation of their target
genes. A full colour version of this figure is available at the
Oncogene journal online.

In addition to sequence-specific DNA-binding transcription
factors, histone code reader proteins, which are recruited and
bound to specific histone modifications, are reported to help to
determine the binding sites of Smad proteins. Massagué and
colleagues have reported that tripartite motif 33 (TRIM33, also
known as TIF1γ or Ectodermin) physically interacts with Smad2
and Smad3 to form a TRIM33-Smad2/3 complex without
Smad4. 69 TRIM33 contains an N-terminal RING finger/B-box/
coiled coil (RBCC) or TRIM domain, and a plant homeodomain
(PHD) zinc finger and a Bromo domain in the C-terminus. They
reported that the PHD-Bromo cassette recognized histone H3
lysine-9 trimethylation (H3K9me3) and H3 acetylation, especially at
lysine residues 18 and 23 (H3K18ac and H3K23ac). During mESC
differentiation, nodal signaling triggered TRIM33-Smad2/3 complex
formation. The TRIM33-Smad2/3 complex recognizes and
binds to H3K9me3-K18ac dual histone marks and displaces the
chromatin-compacting factor heterochromatin protein 1γ (HP1γ)
in the GSC and MIXL1 promoters, resulting in the remodeling of
the local chromatin structure (Figure 3b). 70 Agricola et al. 71 also
found that TRIM33 recognizes and binds to H3K18ac/K23ac. On
the other hand, TRIM33 has been reported to bind Smad4 and
function as a RING-type ubiquitin ligase for Smad4. 72 Consistent
with this model, Agricola et al. 71 reported that TRIM33 inhibits
Smad4 function through ubiquitin-mediated degradation of
Smad4, and that its E3 ubiquitin ligase activity is induced after
binding to histones. The detailed mechanisms have not been
settled, but TRIM33 recognizes a specific histone code and
modulates TGF-β/BMP signaling. Since the relationship between
Smad proteins and histone modification marks has not been fully
elucidated on a genome-wide scale, future analyses will address a
possible mechanistic link between Smad proteins and epigenetic
marks using ChIP-chip/ChIP-seq approaches.

SMAD BINDING AND GENE REGULATION 

Figure 3. Smad proteins and histone modification marks. Smad proteins
have been reported to induce local chromatin remodeling and
modification at their binding sites. Several models are described in
ES cells, where early developmental genes are poised and ready to be
activated in response to extracellular signals, such as nodal. (a)
R-Smads physically interact with a histone demethylase, KDM6B (also
known as JMJD3), and recruit it to their target sites, followed by the
loss of the H3K27me3 repressive mark (light green). (b) Xi et al. 70
reported that nodal signaling triggered TRIM33-Smad2/3 complex
formation. The TRIM33-Smad2/3 complex recognizes and binds to
H3K9me3-K18ac dual histone marks (light blue) and displaces the
chromatin-compacting factor HP1γ (heterochromatin protein 1γ) in the
GSC and MIXL1 promoters, resulting in the remodeling of the local
chromatin structure to make Smad-binding region(s) (red) accessible. A
full colour version of this figure is available at the Oncogene
journal online.

Oncogene (2013) 1609-1615
© 2013 Macmillan Publishers Limited

Previous studies have indicated that binding of transcription
factors detected by ChIP-chip/ChIP-seq experiments is not
necessarily associated with transcriptional regulation of nearby
genes (reviewed in Farnham 73). It has frequently been observed
that changing the level of a DNA-binding transcription factor
alters the expression level of only 1-10% of its potential target
genes. Most of the recent studies have confirmed that 1-20% of
Smad-binding sites are associated with the regulation of
expression of nearby genes. This discrepancy is in part due to
the fact that mRNA levels do not only reflect transcriptional
activities, since mRNA levels are also regulated by other biological
processes, for example, degradation. Another explanation for the
discrepancy is related to the definition of target genes. Although
most studies assign binding sites to the nearest gene within 50 kb,
the nearest gene is not always the regulated one. For example,
Trompouki et al. revealed that several transcription factors,
including Smad1, cooperatively regulate the expression of the
hematopoietic gene LMO2 through binding to the known enhancer
region 72 kb upstream of the transcription start site in K562
cells. 19,74 We also observed that Smad1/5 bound to a region 57 kb
upstream of the transcription start site of Smad6 in ECs, as well as
to the LMO2 -72 kb enhancer. 57 This region has been reported to be
associated with Smad6 expression in the heart, vasculature and
hematopoietic organs, 75 suggesting that the binding to this region,
as well as the promoter region, plays an important role in these
cell types. Recently, methods that characterize the chromatin
architecture have been developed. Chromosome conformation capture
(3C) assays make it possible to study long-distance regulation of
genes by enhancers through formation of chromatin loops (reviewed in
Simonis et al. 76). Application of these technologies will help to
identify the functional relationship between Smad-binding sites
and genes implicated in cancer progression.
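
The nearest-gene rule mentioned above, and the way it misses distal
enhancers, can be sketched in a few lines of Python. The 50-kb window
comes from the text; everything else (the function name, the use of a
single position per binding site, ignoring strand, and the gene names
and coordinates) is a hypothetical simplification for illustration.

```python
from typing import Optional

MAX_DISTANCE = 50_000  # 50-kb window, as described in the text


def assign_nearest_gene(
    site: tuple[str, int],
    tss_table: list[tuple[str, str, int]],
    max_distance: int = MAX_DISTANCE,
) -> Optional[str]:
    """Return the gene whose TSS is closest to the site, or None if none is within range.

    site is (chromosome, position); tss_table holds (gene, chromosome, tss).
    """
    chrom, position = site
    best_gene: Optional[str] = None
    best_distance: Optional[int] = None
    for gene, gene_chrom, tss in tss_table:
        if gene_chrom != chrom:
            continue
        distance = abs(position - tss)
        if distance <= max_distance and (best_distance is None or distance < best_distance):
            best_gene, best_distance = gene, distance
    return best_gene


# Hypothetical genes and coordinates, not real annotations.
tss_table = [("GENE_A", "chr1", 100_000), ("GENE_B", "chr1", 400_000)]
print(assign_nearest_gene(("chr1", 110_000), tss_table))  # GENE_A
print(assign_nearest_gene(("chr1", 28_000), tss_table))   # None
```

In the second call the site lies 72 kb from the GENE_A transcription
start site, so it is left unassigned even though, as with the distal
enhancers described above, it could still regulate that gene.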

It is also possible that for many sites, binding of Smads is not
sufficient for transcriptional regulation, but additional stimuli are
required to drive the expression of the target genes. For example,
costimulation with tumor necrosis factor-α, which induces the
transcriptional repressor ATF-3, affects the regulation of expression
of the Id1 gene and the cellular response. 34,35 Sometimes, ligand
stimulation itself induces these cofactors and creates a
feed-forward circuit, as in myotube differentiation. The myogenic
transcription factor MyoD directly regulates genes expressed
during skeletal muscle differentiation together with other
transcription factors such as MEF2 77 and Zfp238 (also known as
RP58). 78 These transcription factors are also induced by MyoD, and
MEF2 functions with MyoD in a positive feed-forward circuit, 77
while Zfp238 participates in a negative feed-forward circuit. 78
Comparison of MyoD-binding patterns of mouse C2C12 myoblasts
and differentiated myotubes has revealed that most binding
events in myoblasts are not directly associated with gene
regulation. However, MyoD binding increases during myogenic
differentiation at many of the regulatory regions associated with
genes expressed in skeletal muscle. Intriguingly, the
myotube-increased binding sites are enriched for MEF2-like motifs,
while the myotube-decreased peaks are enriched for Zfp238-like
motifs, 59 consistent with the fact that MEF2 positively and
Zfp238 negatively cooperate with MyoD. It is possible that TGF-β
stimulation induces certain transcription factors, which take part
in feed-forward regulatory loops and cooperatively regulate gene
expression, especially at late time points.



IDENTIFICATION OF A TGF-β GENE SIGNATURE

The notion of 'gene signature' comes from the early work on
cancer classification and prognosis prediction using genome-wide
gene expression profiles obtained from microarray analyses of
cancer patients. 79 Identification of a group of genes that reflect
the activity of a common function, pathway or other property in a
specific context is sometimes more revealing than the
analysis of single genes. Gene expression signatures obtained in
experimental conditions have been shown to subcategorize patients
and predict their prognosis. Concerning TGF-β, Coulouarn et al. 80
reported that TGF-β-responsive genes at late time points, or a
late TGF-β signature, which were identified in mouse primary
hepatocytes, successfully discriminate distinct subgroups of
hepatocellular carcinoma and possess a predictive value for
hepatocellular carcinoma patients.

Combination of ChIP-chip/ChIP-seq and genome-wide
transcriptome analyses provides an accurate prediction of target
genes of Smad proteins. TGF-β family members regulate a variety
of target genes both directly and indirectly, and modulate many
biological processes. The chromatin-binding landscape of Smad
proteins, obtained by ChIP-chip/ChIP-seq, will help to identify
specific genes that are directly regulated by Smad proteins. It will
also help to dissect a specific cellular program regulated by TGF-β
family members, for example, the growth inhibitory and apoptosis
programs of TGF-β. So far, many groups have identified groups of
direct TGF-β target genes by using this strategy. Importantly, the
TGF-β/Smad4 target gene signature identified in an ovarian cancer
cell line predicts patient survival, based on in silico mining of
publicly available patient databases. 21 Since TGF-β functions as
a tumor suppressor in low-grade carcinoma cells, while it
promotes metastasis in advanced carcinoma cells, a direct
comparison of the Smad-binding sites of these two stages of
tumorigenesis, obtained from experimental models or from cancer
patients, may reveal specific gene signatures of TGF-β correlating
with its tumor-suppressive and tumor-promoting roles, respectively.
This may provide novel predictive indicators and
biomarkers for TGF-β-targeting treatments.
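
The strategy of combining binding data with transcriptome data can be
sketched as a simple intersection: genes with an assigned Smad-binding
site that also change expression on ligand stimulation are treated as
candidate direct targets. In the Python sketch below, the function
name, the log2 fold-change threshold of 1.0 and the gene names and
values are hypothetical choices for illustration only; the studies
cited here may have used different criteria.

```python
def candidate_direct_targets(
    bound_genes: set[str],
    log2_fold_changes: dict[str, float],
    min_abs_log2fc: float = 1.0,  # assumed threshold, not from the source
) -> dict[str, list[str]]:
    """Split bound genes into induced and repressed candidate direct targets."""
    induced = sorted(
        gene for gene in bound_genes
        if log2_fold_changes.get(gene, 0.0) >= min_abs_log2fc
    )
    repressed = sorted(
        gene for gene in bound_genes
        if log2_fold_changes.get(gene, 0.0) <= -min_abs_log2fc
    )
    return {"induced": induced, "repressed": repressed}


# Hypothetical genes and fold changes, not experimental data.
bound = {"GENE_A", "GENE_B", "GENE_C"}
changes = {"GENE_A": 2.3, "GENE_B": -1.5, "GENE_C": 0.2, "GENE_D": 3.0}
print(candidate_direct_targets(bound, changes))
```

Here GENE_C is bound but unchanged, and GENE_D changes without a
binding site, so neither is reported; the latter would be a candidate
indirect target.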



CONCLUSIONS AND PERSPECTIVES 

The signaling pathways of TGF-β family members are key players
in tumorigenesis and cancer progression. TGF-β can function as both
a tumor-suppressing and a tumor-promoting factor during
cancer progression. BMP signaling has been reported to play
critical roles in oncogene-induced senescence, which is part of the
tumorigenesis barrier and blocks cellular proliferation by inducing
irreversible growth arrest. 65 Interestingly, BMP signaling induces
differentiation of certain cancer-initiating cells, such as
glioma-initiating cells, 81 while TGF-β/activin signaling maintains
their stem cell-like properties. 9,10 Since Smad proteins are central
mediators of the signal transduction, studies on global and
genome-wide binding sites of Smad proteins may reveal
important insights into their complex biological functions.

Identification of an appropriate antibody is the first and most
important step for ChIP-chip and ChIP-seq analyses, because the
quality of ChIP data depends crucially on the quality of the
antibody used. 16 Since MH1 and MH2 domains are conserved
among R-Smads, several specific antibodies for Smad proteins
recognize their linker region. However, linker regions are targets of
posttranslational modification and protein interactions, as
discussed above. It is possible that such changes may attenuate
the affinities of antibodies under specific conditions. Although
ChIP-grade antibodies for Smad proteins have been established
(Supplementary Table 2), careful interpretation of the results will
be required.

In summary, genome-wide analyses of the binding sites of Smad
proteins have led to important discoveries of their
cell-type-specific and context-dependent functions. Application of
genome-wide techniques to experimental models and human samples
derived from cancer patients will help to clarify their complex
mechanisms during cancer progression, and may also provide
potential prognostic biomarkers for future cancer therapy.